Aftershock identification
Earthquake aftershock identification is closely related to the question "Are aftershocks different from
the rest of earthquakes?" We give a positive answer to this question and introduce a general statistical
procedure for clustering analysis of seismicity that can be used, in particular, for aftershock detection. The
proposed approach expands the analysis of Baiesi and Paczuski [PRE, 69, 066106 (2004)] based on the
space-time-magnitude nearest-neighbor distance $\eta$ between earthquakes. We show that for a homogeneous
Poisson marked point field with exponential marks, the distance $\eta$ has a Weibull distribution, which bridges
our results with classical correlation analysis for unmarked point fields. We introduce a 2D distribution
of spatial and temporal components of $\eta$, which allows us to identify the clustered part of a point field.
The proposed technique is applied to several synthetic seismicity models and to the observed seismicity
of Southern California.



PACS numbers: 91.30.Px, 91.30.P-, 91.30.Ab, 02.50.-r 

INTRODUCTION 

Earthquake clustering is the most prominent feature of 
the observed seismicity. The centennial world-wide ob- 
servations have revealed a wide variety of clustering phe- 
nomena that unfold in the time-space-magnitude domain 
(magnitude being the logarithmic measure of earthquake 
energy) and provide the most reliable and useful informa- 
tion about the essential properties of earthquake flow. Well- 
studied types of clustering include aftershocks, foreshocks, 
pairs of large earthquakes, swarms, bursts of aftershocks, 
rise of seismic activity prior to a large regional earthquake, 
switching of the global seismic activity between different 
parts of the Earth, etc. Single clustering phenomena and
their combination are an essential element of understanding
the seismic stress redistribution and lithosphere dynamics [1],
as well as constructing empirical earthquake prediction
methods and evaluating regional seismic hazard [2].

Baiesi and Paczuski [3] have developed an elegant
framework for studying earthquake clustering by defining
the pairwise earthquake distance $\eta_{ij}$ via the expected number
of events in a particular time-space-magnitude domain
bounded by events $i$ and $j$. These authors used the distance
$\eta_{ij}$ to develop a tree-based statistical technique for
earthquake cluster analysis and established several scaling
laws for the observed earthquake clusters.

We expand here the approach of Baiesi and Paczuski [3]
to demonstrate the existence of two statistically distinct
subpopulations in the observed seismicity of Southern California:
one corresponds to a uniform, absolutely random flow of events,
the other to earthquake clustering. The earthquakes from the
clustering part, by and large, obey the conventional definitions
of aftershocks [4]. Our analysis, therefore, provides an objective
statistical foundation for aftershock identification that requires
no prior clustering parameters like the space-time windows
traditionally used for aftershock detection [4].

Our finding is supported by theoretical and numerical
analyses of several seismicity models, including ETAS [5].
The main theoretical result is that for a homogeneous
spatio-temporal Poisson field with independent exponential
magnitudes, the distance $\eta$ has a Weibull distribution, the
same distribution as the Euclidean nearest-neighbor distance
for a homogeneous point field. The proposed cluster detection
technique is built upon the deviations of the observed
nearest-neighbor distance $\eta$ from this theoretical prediction.
The key element of the applied analysis is the 2D distribution
of spatial and temporal components of $\eta$; this distribution
clearly separates the clustered and non-clustered parts of a
point field.



DISTANCE BETWEEN EARTHQUAKES 

Consider an earthquake catalog $\{t_i, \theta_i, \phi_i, m_i\}_{i=1,\dots,N}$.
Each record $i$ describes an individual earthquake with occurrence
time $t_i$, position given by latitude $\theta_i$ and longitude
$\phi_i$, and magnitude $m_i$; here, we do not consider the depth.

For any two earthquakes $i$ and $j$ we define the time-space-magnitude
distance by

$$n_{ij} = \begin{cases} C\,\tau_{ij}\,r_{ij}^{d}\,10^{-b(m_i - m_0)}, & \tau_{ij} \ge 0,\\ \infty, & \tau_{ij} < 0. \end{cases} \qquad (1)$$

Here $\tau_{ij} = t_j - t_i$ is the earthquake intercurrence time; $r_{ij}$ is the
surface distance; $d$ is the fractal dimension of earthquake
epicenters; and $b$ is the parameter of the Gutenberg-Richter
relation (exponential fit to the magnitude distribution):

$$P\{m > x\} = 10^{-b(x - m_0)}\,\mathbf{1}_{\{x > m_0\}}. \qquad (2)$$

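The Gutenberg-Richter relation (2) can be illustrated numerically. The following is a minimal sketch, assuming magnitudes are drawn by inverse-transform sampling; the function names `sample_gr_magnitudes` and `gr_survival` and the default parameter values are hypothetical and serve only as an illustration.

```python
import numpy as np

def sample_gr_magnitudes(n, b=1.0, m0=2.0, rng=None):
    """Draw n magnitudes with P{m > x} = 10**(-b*(x - m0)) for x >= m0."""
    rng = np.random.default_rng() if rng is None else rng
    u = 1.0 - rng.random(n)          # uniform on (0, 1], avoids log10(0)
    return m0 - np.log10(u) / b      # inverse of the survival function

def gr_survival(x, b=1.0, m0=2.0):
    """Tail probability P{m > x} of Eq. (2); equals 1 below the cutoff m0."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= m0, 10.0 ** (-b * (x - m0)), 1.0)
```

With $b = 1$, roughly one event in ten exceeds the cutoff by a full magnitude unit, which is what the empirical frequency `np.mean(m > m0 + 1)` should approach for a large sample.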
Connecting each event with its nearest neighbor with respect
to the distance $n$, one obtains a time-oriented tree $\mathcal{T}$
whose root is the first event in the catalog. Such trees were
introduced and studied by Baiesi and Paczuski [3].

It is readily checked that the space-time volume of a
ball of radius $C$ in metric $n$, $B_C := \{(t,x,y,m) : n(t,r,m) < C\}$,
is infinite due to heavy tails of the distance $n$ in time
when $d > 2$, in space when $d < 2$, and
in both time and space for $d = 2$. Hence, any such ball
almost surely contains an infinite number of events from
$N$, which prevents meaningful nearest-neighbor analysis. To
avoid this, we introduce the truncated distance

$$\eta_{ij} = \begin{cases} n_{ij}, & \tau_{ij} \le \tau_0,\ r_{ij} \le r_0,\\ \infty, & \text{otherwise.} \end{cases} \qquad (3)$$

Choosing the truncation thresholds $\tau_0$ and $r_0$ large enough will ensure
that the measures $\eta$ and $n$ are equivalent within a bounded
spatio-temporal area. The nearest-neighbor distance is defined as
$\eta^*_j := \min_i \eta_{ij}$. We will drop the subindices $ij$ or $j$ unless
it is important which pair of earthquakes is considered.
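A brute-force computation of the nearest-neighbor distance may help fix ideas. The sketch below assumes the product form $\tau\, r^d\, 10^{-b(m_i - m_0)}$ for the pairwise distance (with the constant set to 1), planar coordinates instead of latitude and longitude, and strictly earlier parents ($\tau > 0$) so that an event is never its own neighbor. The function name, the default values of `b` and `d`, and the truncation arguments `tau0`, `r0` are hypothetical.

```python
import numpy as np

def nearest_neighbor_distance(t, x, y, m, b=1.0, d=1.6, m0=0.0,
                              tau0=np.inf, r0=np.inf):
    """Return (eta_star, parent): nearest-neighbor distance and parent index.

    Events without an admissible earlier event get eta_star = inf, parent = -1.
    """
    t, x, y, m = (np.asarray(a, dtype=float) for a in (t, x, y, m))
    n = t.size
    eta_star = np.full(n, np.inf)
    parent = np.full(n, -1)
    for j in range(n):
        tau = t[j] - t                       # intercurrence times to event j
        r = np.hypot(x[j] - x, y[j] - y)     # planar distance (assumption)
        eta = tau * r**d * 10.0 ** (-b * (m - m0))
        # Only strictly earlier events inside the truncation window qualify.
        ok = (tau > 0) & (tau <= tau0) & (r <= r0)
        eta = np.where(ok, eta, np.inf)
        i = int(np.argmin(eta))
        if np.isfinite(eta[i]):
            eta_star[j], parent[j] = eta[i], i
    return eta_star, parent
```

The loop costs $O(N^2)$ operations, which is adequate for a sketch; a spatial index would be needed for large catalogs.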



MAIN RESULT: POISSON FIELD 

Consider a spatio-temporal marked point field $N$ with
temporal component $t \in \mathbb{R}$, spatial component $x \in \mathbb{R}^2$ and
scalar marks $m$ that represent the earthquake magnitude.
Below we formulate our main result, starting with essential
assumptions about the field $N$.

Assumption 1 (i) $N$ is a homogeneous Poisson marked
point field with intensity $\lambda$. (ii) Magnitude marks are
independent of the field $(t_i, x_i)$ and each other and have
exponential distribution (2) with parameters $\tilde b$, $\tilde m_0$. (iii)
Let $f = b/\tilde b$ and $\mu = 10^{b(\tilde m_0 - m_0)}$, where $b$ and $m_0$ are
the prior parameters of the Gutenberg-Richter law (2) used
in (1).
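A field satisfying parts (i) and (ii) of the assumption can be simulated directly. The sketch below assumes a rectangular space-time window $[0, t_{max}] \times [0, L]^2$; the function name `simulate_poisson_field` and its arguments are hypothetical.

```python
import numpy as np

def simulate_poisson_field(lam, t_max, side, b=1.0, m0=0.0, rng=None):
    """Homogeneous Poisson marked field on [0, t_max] x [0, side]^2.

    lam is the intensity per unit time and unit area; marks follow
    P{m > x} = 10**(-b*(x - m0)) independently of times and positions.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = rng.poisson(lam * t_max * side**2)      # random number of events
    t = np.sort(rng.uniform(0.0, t_max, n))     # time-ordered occurrence times
    x = rng.uniform(0.0, side, n)
    y = rng.uniform(0.0, side, n)
    m = m0 - np.log10(1.0 - rng.random(n)) / b  # exponential magnitudes
    return t, x, y, m
```

Conditional on the number of events, the points of a homogeneous Poisson field are independent and uniform over the window, which is why uniform sampling suffices here.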

Proposition 2 Under Assumption 1, the nearest-neighbor
distance $\eta^*$ has the following distribution, for
large $\tau_0$, $r_0$:

$$P\{\eta^* < x\} = 1 - \exp\left(-\lambda\gamma\,\ldots\right). \qquad (4)$$

Here $\gamma$ is independent of $x$ and we have

$$\ldots = \begin{cases}
\ldots, & d < 2,\ f < 1,\\
\ldots, & d = 2,\ f < 1,\\
\ldots, & d > 2,\ d > 2f,\\
\ldots, & d > 2,\ d = 2f,\\
\ldots, & d < 2f,\ f > 1,\\
\ldots, & d < 2,\ f = 1,\\
\ldots, & d = 2,\ f = 1.
\end{cases} \qquad (5)$$




FIG. 1: Distribution of time and space components, $(T,R)$, of
the nearest-neighbor distance $\eta$ for homogeneous Poisson field
with exponential magnitudes (a), single aftershock series obeying
Omori law (b), ETAS model (c).

Proof will be published elsewhere.

Proposition 2 implies that, for $\tilde b \ne b$, $d \ne 2$, and
$d \ne 2f$, $\eta^*$ has a Weibull distribution. Furthermore, the
distribution of $\eta^*$ is independent of the magnitude threshold
$m_0$, when the latter is known (which is obviously the case
in practice). This facilitates analysis of data from different
periods and regions that might have different $m_0$.
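For comparison, the classical Weibull law for the Euclidean nearest-neighbor distance follows from a one-line argument. The derivation below is a standard result for an unmarked homogeneous Poisson field in $\mathbb{R}^k$ with intensity $\lambda$; the symbols $\rho$ (distance from a fixed point to the nearest event) and $v_k$ (volume of the unit ball) are introduced here only for illustration.

```latex
% The distance rho exceeds x exactly when the ball of radius x is empty,
% and the number of events in that ball is Poisson with mean lambda*v_k*x^k.
\begin{align*}
P\{\rho \ge x\} &= P\{\text{no events in the ball of radius } x\}
                 = \exp\left(-\lambda v_k x^{k}\right),
\qquad v_k = \frac{\pi^{k/2}}{\Gamma(k/2 + 1)},\\
P\{\rho < x\}   &= 1 - \exp\left(-\lambda v_k x^{k}\right).
\end{align*}
% This is a Weibull distribution with shape k and scale (lambda*v_k)^{-1/k};
% in the plane (k = 2) it reads 1 - exp(-lambda*pi*x^2).
```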

Let earthquake $i$ be the nearest neighbor for earthquake
$j$, that is $\eta^*_j = \eta_{ij}$. We define, for arbitrary $0 < q < 1$,

$$T_{ij} = \tau_{ij}\,10^{-q b m_i}, \qquad R_{ij} = r_{ij}^{d}\,10^{-(1-q) b m_i}. \qquad (6)$$

Obviously $\eta^* = TR$ (without loss of generality, we assumed
here $C = 1$ and $m_0 = 0$) and Proposition 2 implies
that the distribution of the pair $(T,R)$ is concentrated along
the line $\log_{10}T + \log_{10}R = \log_{10}x_m$, where $x_m$ is the mode
of the distribution (4), while the level lines are of the form
$\log_{10}T + \log_{10}R = \mathrm{const}$. Figure 1(a) illustrates this by
showing the empirical distribution of the pairs $(T,R)$ for
a homogeneous Poisson field with exponential magnitudes.
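The decomposition of the distance into temporal and spatial components can be sketched as follows. The code assumes the split $T = \tau\,10^{-q b m}$, $R = r^d\,10^{-(1-q) b m}$, where $m$ is the magnitude of the parent event, with unit constant and zero magnitude cutoff; the function name `rescaled_components` and the default values are hypothetical.

```python
import numpy as np

def rescaled_components(tau, r, m_parent, b=1.0, d=1.6, q=0.5):
    """Rescaled time T and rescaled distance R of a nearest-neighbor link.

    By construction T * R = tau * r**d * 10**(-b * m_parent), so a fixed
    distance corresponds to a straight line of slope -1 in the
    (log10 T, log10 R) plane.
    """
    tau, r, m_parent = (np.asarray(a, dtype=float) for a in (tau, r, m_parent))
    T = tau * 10.0 ** (-q * b * m_parent)
    R = r**d * 10.0 ** (-(1.0 - q) * b * m_parent)
    return T, R
```

The weight $q$ only redistributes the magnitude correction between the two axes; the product $TR$, and hence the diagonal level lines, do not depend on it.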



MODELED SEISMICITY 

Here we analyze numerically the distribution of nearest-neighbor
distances $\eta^*$ for three point field models: (i) homogeneous
Poisson marked field, (ii) single self-excited
aftershock series governed by Omori law, and (iii) ETAS
model that combines the first two.

The Epidemic Type Aftershock Sequence (ETAS) model
was introduced by Y. Ogata [5]; it specifies a marked point
process $N$ by its conditional intensity at instant $t$ and
spatial location $(x,y)$:

$$\Lambda(t,x,y) = \Lambda_0 + \sum_{i:\,t_i<t} 10^{b m_i}\,\Lambda_T(\tau)\,\Lambda_R(r), \qquad (7)$$

where $\Lambda_0 > 0$, $\tau = t - t_i$, $r^2 = (x - x_i)^2 + (y - y_i)^2$, and
the temporal ($\Lambda_T$) and spatial ($\Lambda_R$) kernels are given by [5]
$\Lambda_T(\tau) = (\tau + c)^{-1-\epsilon_T}$, $\Lambda_R(r) = (r + d)^{-1-\epsilon_R}$ with
positive $c$, $d$, $\epsilon_T$ and $\epsilon_R$. Magnitudes are drawn independently
from the exponential distribution.
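The conditional intensity can be evaluated directly from a list of past events. The following sketch assumes power-law kernels $(\tau + c)^{-1-\epsilon_T}$ and $(r + d)^{-1-\epsilon_R}$ and, as a further assumption, a magnitude-dependent productivity factor $10^{b m_i}$; the function name `etas_intensity` and all argument names are hypothetical.

```python
import numpy as np

def etas_intensity(t, x, y, events, lam0, b, c, d_scale, eps_t, eps_r):
    """Conditional intensity at (t, x, y) given past events.

    events: iterable of rows (t_i, x_i, y_i, m_i); only rows with t_i < t
    contribute. lam0 is the background rate; d_scale plays the role of the
    spatial kernel offset.
    """
    ev = np.asarray(events, dtype=float).reshape(-1, 4)
    past = ev[ev[:, 0] < t]
    tau = t - past[:, 0]
    r = np.hypot(x - past[:, 1], y - past[:, 2])
    kernel_t = (tau + c) ** (-1.0 - eps_t)       # Omori-type temporal decay
    kernel_r = (r + d_scale) ** (-1.0 - eps_r)   # power-law spatial decay
    productivity = 10.0 ** (b * past[:, 3])      # assumed magnitude scaling
    return lam0 + float(np.sum(productivity * kernel_t * kernel_r))
```

Setting `lam0 = 0` and supplying a single event at the origin gives the intensity of a single aftershock series triggered by that mainshock.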

A single aftershock series is a particular case of the ETAS
model with $\Lambda_0$ replaced by $\delta(0,0,0)$, which represents the
mainshock; its magnitude is a model parameter.

FIG. 2: Distribution of the nearest-neighbor distance $\eta$ for homogeneous
Poisson field with exponential magnitudes (top), single
aftershock series obeying Omori law (middle), ETAS model (bottom).

Figures 1 and 2 show the distributions of $\eta^*$ and the
corresponding pairs $(T,R)$. The Poisson model behaves as
suggested by Proposition 2. For a single aftershock series,
one observes an almost symmetric scatter, which suggests that
$T$ and $R$ are independent. This is the most important
difference from the Poisson model. The ETAS distribution
has two prominent "modes": a scatter along $TR = \mathrm{const}$
in the upper right part of the plot and an apparently
independent scatter closer to the origin. Evidently, by combining
the homogeneous Poisson flow and aftershock clustering
we have also combined the corresponding modes of the
$(T,R)$ distributions.

OBSERVED SEISMICITY: SOUTHERN CALIFORNIA 

We use a Southern California earthquake catalog produced
by the Advanced National Seismic System (ANSS) [6],
and consider earthquakes with magnitude $m > 2.0$
that fall within the square region bounded by 122°W,
114°W, 32°N, 37°N during January 1, 1984 - December 31, 2004.

The empirical distributions of the nearest-neighbor distance
$\eta^*$ and its components $(T,R)$ are shown in Figs. 3 and 4.

FIG. 3: Distribution of the nearest-neighbor distance $\eta$ for the
observed seismicity of Southern California during 1984-2004;
different panels correspond to different lower magnitude cutoffs.
Notice the bimodal structure with the same boundary between
modes at $\eta \approx 10^{-5}$.

Both distributions are prominently bimodal, reminiscent of
those observed for the ETAS model; they reveal the existence of two
statistically distinct earthquake populations. One of them
corresponds to $\log_{10}T + \log_{10}R \approx -3$ (that is, $TR \approx 10^{-3}$);
according to Proposition 2 it describes homogeneous (Poisson)
seismicity. The other population corresponds to $\log_{10}R \approx -2$;
it describes the aftershock clustering.

FIG. 4: Distribution of time and space components, $(T,R)$, of the
nearest-neighbor distance $\eta$ for the observed seismicity of Southern
California during 1984-2004. Notice the bimodal structure;
the location of the solid line $\log_{10}T + \log_{10}R = \mathrm{const}$ is the same
in all panels.



To detect individual aftershocks, we fix a threshold $\eta_0$
and remove all the links with $\eta^* > \eta_0$ from the tree $\mathcal{T}$. This
results in the forest (set of trees) $\mathcal{F}(\eta_0) = \{\mathcal{T}_i\}_{i=1,2,\dots}$.
Each tree $\mathcal{T}_i$ in the forest corresponds to a single earthquake
cluster: the distance between linked elements within any
tree is smaller than that between any two elements from
distinct trees. These clusters can be further analyzed in
order to solve a particular applied problem. For example,
aftershocks are often assumed to have smaller magnitude
than the corresponding mainshocks [4]. Possible earthquake
clusters observed prior to the mainshock are then
called foreshocks. In this situation, it is natural to define
the $i$-th mainshock as the largest earthquake within the tree $\mathcal{T}_i$,
and aftershocks (foreshocks) as the events from $\mathcal{T}_i$ that
occurred later than (prior to) the mainshock. The results of
this aftershock-detection procedure in California are shown
in Fig. 5; here we used $\eta_0 = 10^{-5}$, suggested by the
distributions of $\eta^*$ and $(T,R)$ (Figs. 3, 4). The figure focuses
on the Landers earthquake, the largest one in California during
the considered period. Three groups of earthquakes are
identified as aftershocks: a) the prominent earthquake cluster
in the immediate vicinity of the Landers epicenter; b)
the "secondary" aftershocks after the Big Bear earthquake,
M=6.4, which itself is the largest aftershock of Landers; c)
several earthquakes that occurred immediately after Landers
but at a large distance from it. Such "distant" aftershocks
present a special interest in many seismic studies.
Neither the Northridge nor the Hector Mine aftershock cluster
has been associated with Landers. We emphasize, though, the
existence of a distant Landers aftershock close to the future
epicenter of Hector Mine.
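The thresholding and labeling steps described above can be sketched in a few lines. The code assumes that events are indexed in time order, so that every parent has a smaller index than its child, and that ties for the largest magnitude in a cluster go to the earliest event; the function names `cluster_labels` and `classify_events` are hypothetical.

```python
import numpy as np

def cluster_labels(parent, eta_star, eta0):
    """Cut links with eta_star > eta0 and label each event by its tree root.

    parent[j] is the index of the nearest neighbor of event j (-1 if none).
    """
    parent = np.asarray(parent, dtype=int)
    eta_star = np.asarray(eta_star, dtype=float)
    labels = np.arange(parent.size)
    # Parents precede children, so a single forward pass propagates roots.
    for j in range(parent.size):
        if parent[j] >= 0 and eta_star[j] <= eta0:
            labels[j] = labels[parent[j]]
    return labels

def classify_events(t, m, labels):
    """Mark the largest event of each tree as mainshock, the rest by timing."""
    t = np.asarray(t, dtype=float)
    m = np.asarray(m, dtype=float)
    labels = np.asarray(labels)
    roles = np.empty(t.size, dtype=object)
    for lab in np.unique(labels):
        idx = np.flatnonzero(labels == lab)
        main = idx[np.argmax(m[idx])]        # first maximum, i.e. earliest
        for k in idx:
            if k == main:
                roles[k] = "mainshock"
            elif t[k] > t[main]:
                roles[k] = "aftershock"
            else:
                roles[k] = "foreshock"
    return roles
```

Under this convention an isolated event forms a tree of size one and is labeled as its own mainshock.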




FIG. 5: (Color online) Aftershock identification for the Landers
earthquake (June 28, 1992, M7.3). The figure shows all earthquakes
that occurred after Landers. Shaded circles mark
earthquakes identified as Landers aftershocks; open circles mark
the remaining earthquakes.



CONCLUSION AND DISCUSSION 

We demonstrated the existence of statistically distinct
clustered and non-clustered parts in the observed seismicity.
This finding has important implications for various
problems, aftershock detection being the most prominent
one. The physical interpretation of the reported separation
as well as its further applications will be considered in a
forthcoming paper.

The current definition of the distance $\eta$ remains ad hoc;
a partial justification for this choice is provided by our
result on the distribution for $\eta^*$ (Proposition 2), which
coincides with the Euclidean nearest-neighbor distance
distribution for a homogeneous (unmarked) point field. An
analog of Proposition 2 is readily proven for any nearest-neighbor
distance that depends multiplicatively on spatio-temporal
point location and multidimensional mark $m$:
$\eta = \tau r^d f(m)$. It would be interesting to see how
alternative definitions of $\eta$ will alter the applied part of the
proposed clustering analysis.